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1.  INTRODUCTION 


We  consider  least  squares  model  fitting  tasks  that  can  be  formulated 
as  constrained  minimization  problems.  In  general,  the  constraints,  i.e., 
the  model  equations,  are  implicit  nonlinear  relations  between  observables 
and  parameters.  The  solution  of  such  problems  can  be  found  by  the  Lagrange 
multiplier  technique  which  provides  a  system  of  coupled  nonlinear  normal 
equations.  The  equations  can  be  used  to  analyze  the  sensitivity  of  the 
solution  to  data  perturbations,  and  to  obtain  numerical  values  of  optimal 
residuals  and  parameters.  However,  the  numerical  solution  of  the  system 
is  not  necessarily  trivial,  because  the  size  of  the  system  is  proportional 
to  the  number  of  data.  Because  of  the  potentially  very  large  size  of 
the  equation  system,  a  partitioning,  if  possible,  has  many  algorithmic 
advantages.  Fortunately,  many  typical  least  squares  model  fitting  prob¬ 
lems  have  such  a  norm. ’  equation  structure  that  a  partitioning  of  the 
equation  system  can  be  achieved  by  proper  manipulations  of  the  model 
equations.  Particularly  effective  can  be  a  manipulation  of  parameters, 
e.g.,  an  introduction  of  new  parameters  and/or  a  formal  elimination  of 
some  parameters.  We  will  consider  the  effects  of  such  manipulations 
and  derive  partitionability  conditions  which  can  be  used  as  a  basis  for 
a  rational  choice  of  model  equation  formulation,  and  for  a  rational 
planning  of  experiments. 

In  Section  2  we  will  give  a  formal  definition  of  the  least  squares 
model  fitting  problem,  and  establish  the  normal  equations.  Algorithms 
for  the  numerical  solution  of  the  equations  will  be  outlined  in  Section  3, 
where  also  the  sensitivity  of  the  solution  to  data  perturbations  Is 
investigated.  Partitionability  of  the  normal  equations  and  corresponding 
parameter  manipulation  are  discussed  in  Sections  4  and  5,  repectively. 
Section  6  brings  an  example  for  partitioning  of  normal  equations  arising 
in  the  adjustment  of  a  planimetric  traverse. 


2.  THE  MODEL  FITTING  PROBLEM 

Let  the  mathematical  model  of  an  observable  event  be  formulated  by 
a  set  of  r  independent  equations,  say. 


F(x ,  t)  =  0  , 


(2.1) 


where  F  e  R  is  a  vector  function,  x  e  R  represents  potential  observations, 
and  t  e  RP  is  a  vector  of  free  parameters.  Eqs.  (2.1)  may  be  prescribed 
by  a  theory  of  the  event,  or  chosen  by  other  considerations.  Regardless 
of  their  source,  the  equations  are  considered  as  a  given  description  of 
the  event.  A  model  fitting  problem  arises  when  one  seeks  to  determine 
the  validity  of  the  mathematical  description  by  testing  Eqs.  (2.1)  with 
experimental  data.  In  such  tests,  typical  magnitudes  of  the  dimensions 
p,  r,  and  n  are  10,  100,  and  1000,  respectively.  We  assume  that  the 
dimensions  satisfy  the  inequalities 
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0  1  P  <  1  n, 


(2.2) 


which  assure  that  the  optimization  problem  has  sufficient  degrees  of 
freedom. 

We  restrict  our  considerations  to  model  functions  F  that  are  twice 
differentiable  with  respect  to  all  its  n  +  p  arguments.  Differentiability 
of  model  functions  is  typical  for  many  problems  in  engineering,  physics 
and  other  fields  of  application.  For  the  present  analysis,  it  has  to  be 
assumed  only  within  a  neighborhood  of  the  adjusted  observations  x  and  of 
the  optimal  parameter  t. 

Let  X  e  Rn  be  the  vector  of  the  actual  observations.  Because  X 
contains  observational  inaccuracies  one  cannot  expect  that  the  theoret¬ 
ical  description  of  the  event,  i.e.,  Eq.  (2.1),  is  satisfied  at  x  *  X. 
Instead,  one  needs  corrections  (residuals)  c  which  are  added  to  the 
observations  so  that  Eq.  (2.1)  is  replaced  by 


F (X  +  c,t)  *  0 


(2.3) 


Eq.  (2.3)  means  that  we  expect  the  theoretical  relations  between  observ¬ 
ables  and  parameters  to  be  satisfied  at  a  vicinity  X  +  c  of  the  actual 
observations  X. 

The  mathematical  goal  of  model  fitting  is  to  find  residuals  c  and 
parameters  £  that  are  optimal  in  some  sense.  We  do  not  put  any  restric¬ 
tions  on  t,  but  obviously  would  like  the  residuals  c  to  be  as  small  as 
possible.  Hence,  a  general  model  fitting  problem  can  be  formulated  as 
the  following  constrained  minimization  task 


| jc| j  =  min.  ,  (2.4a) 

F (X  +  c,t)  =  0  .  (2.4b) 


The  solution  of  this  problem,  of  course,  depends  on  the  definition  of 
the  norm  | |c||.  In  this  article  we  consider  only  elliptic  vector  norms, 
defined  by 


|  |c|  |  =  >/cTMc 


(2.5) 
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where  M  is  a  positive  definite  norm  matrix.  One  reasonable  choice  of  M 
is  a  diagonal  matrix  where  the  elements  are  proportional  to  the  inver 
squares  of  estimated  standard  errors  of  the  observed  components  of  X. 
With  this  choice,  the  norm  ||c||  becomes  dimensionless  and  the  corres¬ 
ponding  minimization  problem  is  called  "weighted  least  squares."  A 
more  general  choice  for  M  is  the  Inverse  of  an  estimated  variance-covar¬ 
iance  matrix  R  of  the  observations  X.^»4  If  the  observational  errors 
of  X  are  normally  distributed,  then  the  use  of  this  norm,  i.e.,  of 


(2.6) 


in  Eq.  (2.4a)  produces  a  maximum  likelihood  result.  The  norm  (2.6)  is, 
of  course,  also  dimensionless,  and  it  includes  the  weighted  least  squares 
norm  as  a  special  case. 

Using  the  norm  (2.6)  we  formulate  a  general  least  squares  model 
fitting  problem  as  follows: 

W  =  j  | c |  | 2  =  cTR-1c  *  min.  ,  (2.7a) 


F(X  +  c,t)  »  0  . 


(2.7b) 


The  unknowns  of  the  problem  are  the  residuals  c  and  the  parameter  vec¬ 
tor  t.  Given  are  the  observations  X,  the  model  function  F  and  an  esti¬ 
mated  variance-covariance  matrix  R.  The  latter  need  to  be  known  only 

1 W.E .  Deming ,  "Statistical  Adjustment  of  Data,"  John  Wiley  &  Sons,  New 
York,  NY,  1944. 

2 

S.  Brandt,  "Statistical  and  Computational  Methods  in  Data  Analysis," 
North-Holland  Publishing  Co.,  Amsterdam,  1970. 

J 

D.  Brown,  "A  Matrix  Treatment  of  Least  Squares  Considering  Correlated 
Observations,"  USA  Ballistic  Research  Laboratories  Report  No.  937 ,  1955. 
(AD  #71209) 

4 

J.M.  Tvenstra,  "Theory  of  the  Adjustment  of  Normally  Distributed  Obser¬ 
vations,"  N.V.  Uitgevery  "Argus,"  Amsterdam,  1956. 

. M.F .  Britt  and  R.H.  Luecke,  "The  Estimation  of  Parameters  in  Nonlinear, 
Implicit  Models,"  Technometrics,  15j  233-247,  1973. 
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up  to  a  factor,  because  the  Inclusion  of  an  arbitrary  factor  in  Eqs.  (2.7a) 
does  not  change  the  minimum  condition.  Also,  the  model  function  F  can 
be  manipulated  as  long  as  the  result  produces  a  system  of  equations 
mathematically  equivalent  to  Eqs.  (2.7b). 

Problem  (2.7)  can  be  reduced  to  an  unconstrained  minimization  prob¬ 
lem  if  the  residuals  c  can  be  eliminated  from  Eqs.  (2.7a)  by  using 
Eqs.  (2.7b).  This  is  the  case,  e.g.,  when  Eqs.  (2.7b)  are  explicit  in 
terms  of  X  +  c.  Many  least  squares  algorithms  have  been  devised  to 
treat  this  special  problem.  In  this  article,  we  consider  the  more  gen¬ 
eral  situation  where  F  is  a  general  nonlinear  function  such  that  a 
formal  elimination  of  all  or  some  residuals  is  either  not  possible  or 
not  practical. 

In  order  to  simplify  our  notation  in  the  subsequent  analysis,  we 
shall  denote  derivatives  of  F  by  subscripts.  Thus,  e.g.. 


3F  (X  +  c ,  t) 
3X 


FX(X  +  c,t) 


is  a  rxn  matrix,  and 


32(KTF(X  +  c,  t)) 
3X3t 


(KTF(X  +  c,t)), 
Ki 


x 

where  K  e  R  ,  is  a  nxp  matrix. 

In  addition  to  the  differentiability  of  F  we  also  assume  that  in 
a  neighborhood  of  the  solution  (X  +  c,t)  the  following  rank  conditions 
are  satisfied: 


rank  F  »  r 

A 


(2.8) 


and 


rank  F  ■  p  . 


(2.9) 


The  condition  (2.8)  insures  that  the  model  Eqs.  (2.3)  are  independent. 

The  conditon  (2.9)  excludes  model  formulations  with  redundant  parameters. 
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Next,  we  obtain  normal  equations  for  the  problem  (2.7)  using 
Lagrange  multipliers.  Let  k  e  Rr  be  a  vector  of  correlates  (Lagrange 
multipliers) ,  and  let 


2 f  =  1/2  cTR_1c  -  kTF  (X  +  c,  t) 


(2.10) 


be  the 
respect 


modified  object  function.  By  setting  the  derivatives  of  ft  with 
to  c,  t  and  k  equal  to  zero,  we  obtain  the  equations. 5»6,7,8,9 


c  -  R  •  F  T  (X  +  c,  t)  •  k  *  0  ,  (2.11a) 

kT  •  F  (X  +  c,  t)  =  0  ,  (2.11b) 

F (X  +  c,  t)  «  0  .  (2.11c) 


The  normal  Eqs.  (2.11)  generally  have  more  than  one  solution,  and  the 
solution  of  the  optimization  problem  (2.7)  is  one  of  the  several  solutions 
of  Eqs.  (2.11).  The  selection  of  the  proper  solution  is  normally  done 
by  an  analysis  of  problem  related  background  information.  We  shall  not 
discuss  such  analyses  in  this  article,  and  concentrate  instead  on  the 
finding  of  any  numerical  solution  of  Eq.  (2.11). 

3.  ITERATION  ALGORITHMS  AND  EFFECTS  OF  DATA  PERTURBATION 

The  normal  Eqs.  (2.11)  are  nonlinear  with  respect  to  the  unknowns 
c  and  t.  Therefore,  their  numerical  solution  will  generally  require  an 
iteration.  One  obtains  second  order  Newton-type  iteration  equations  by 
expanding  the  normal  equations  at  an  approximate  solution. 


6 

A.  CelmiyS,  " Least  Squares  Adjustment  with  Finite  Residuals  for  Ron- 
Linear  Constraints  and  Partially  Correlated  Data, "  USA  Ballistic  Research 
Laboratories  Report  No.  R-1658,  1973.  (AD  #766283) 

7 

A.F.  Pope,  "Two  Approaches  to  Nonlinear  Least  Squares  Adjustments, " 

The  Canadian  Surveyor,  28j  663-669,  1974. 


Q 

R.M.  Passi,  "Use  of  Nonlinear  Least  Squares  in  Meteorological  Appli¬ 
cations,"  Journal  of  Applied  Meteorology,  16_,  828-832,  1977, 
and  17j  1579-1580,  1978. 

g 

W.H.  Jefferys,  "On  the  Method  of  Least  Squares , "  The  Astronomical 
Journal,  85j  177-182,  1980. 
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Let  C,  K  and  T  be  approximations  to  the  solution  vectors  c,  k  and 
t,  respectively,  and  e,  x  and  t  be  the  corresponding  corrections.  The 
linear  terms  of  an  expansion  of  Eq.  (2.11)  at  the  approximate  solution 
produce  the  following  system  of  Newton-Raphson  equations  for  the  cor¬ 
rections^  : 


[i-R  •(KT-F)xxJ-e  -  R-FxT-(K+ic)  -  R-(KT*F)Xt;-T  =  -C 

(KT-F)tX-e  +  FtT-(K+x)  +  (KT-F)tt  -t  =  0 

F  *e  +  F  *T  =  -F 
A  t 


(3.1) 


The  arguments  of  F  and  of  its  derivatives  in  Eq.  (3.1)  are  the  approxi¬ 
mations  X  +  C  and  T. 

An  iteration  based  on  Eq.  (3.1)  proceeds  by  computing  the  corrections 
e,  k  and  t,  adding  them  to  the  approximations  C,  K  and  T,  respectively, 
and  repeating  the  process.  The  equations  may  be  rearranged  into  a  more 
convenient  form  for  the  iteration.  An  example  of  such  rearranged  iter¬ 
ation  equations  is  given  in  the  Appendix  and  corresponding  computer 
programs  are  described  in  Reference  10. 


An  often  used  variation  of  the  Newton-Raphson  equations  is  obtained 

by  setting  in  Eq.  (3.1)  all  second  order  derivatives  of  F  equal  to 
1 » 2 ,5 ,9 


zero. 


The  resulting  equations  are  called  Gauss-Newton  equations, 
rithms  based  on  Gauss-Newton, 
linearly  and  may  have  other  disadvantages. 


Iteration  algorithms  based  on  Gauss-Newton^ equations  converge  only 


The  linear  terms  of  the  expansion  of  the  normal  equations,  i.e., 
the  equation  system  (3.1),  also  provide  a  means  to  obtain  estimates  of 
the  effects  of  data  perturbations  on  the  solution.  To  this  end,  we 
express  the  normal  equations  in  terms  of  the  corrected  observations 
x  =  X  +  c,  obtaining  the  system 


x  -  R*FxT(x,t) -k*  X  ,  . 

kT*Ft(x,t)  -  0  ,  >  (3.2) 

F(x,t)  =  0  ,  ' 

^°A.  CelmiyS,  "A  Manual  for  General  Least  Squares  Model  Fitting 3" 

USA  Ballistic  Research  Laboratory  Report  ARBRL-TR-02167 ,  1979. 

(AD  #B040229L) 
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which  we  expand  at  the  solution.  The  linear  terms  of  the  expansion 
yield  the  following  relation  between  the  differentials  of  the  solution 
x,  k,  t  and  the  differentials  of  the  observations  X: 


[i-R* (kT*F)  ]  dx  -  R-F  Tdk  -  R-(kT*F)  dt  -  dX  ,  i 

L  XX J  X  xt  j 

(kT*F)txdx  +  FtTdk  +  (kT-F)ttdt  -  0  ,  |  (3.3) 

F  dx  +  F  dt  *  0  . 
x  t 

The  coefficient  matrices  in  Eq.  (3.3)  are  identical  to  those  in 
Eq.  (3.1),  except  that  now  the  functions  are  evaluated  at  the  solution. 
Therefore,  differential  changes  of  the  solution  corresponding  to  data 
perturbations  dX  can  be  calculated  conveniently  by  using  the  iteration 
equations  of  the  Appendix.  Thus,  if  one  is  interested  in  the  changes 
dt  of  the  parameters  corresponding  to  the  perturbations  dX,  one  can  use 
the  formula 


N  dt  »  S  dX  , 


(3.4) 


where  N  and  S  are  defined  in  the  Appendix  in  terms  of  F  and  its  derivatives. 

Eq.  (3.4)  also  can  be  used  to  derive  a  formula  for  an  estimate  of 
the  variance-covariance  matrix  Vt  of  the  components  of  the  parameter 
vector  t.  The  formula  is  obtained  by  applying  the  law  of  variance  pro¬ 
pagation  to  Eq.  (3.4)  with  the  result^* 2 *10 

V  =  N_1SRST(N'1)T  .  (3.5) 

In  case  the  variance-covariance  matrix  R  of  the  data  has  been 
estimated  only  up  to  a  factor,  the  formula  must  be  supplemented  by  an 
estimate  of  that  factor.  The  usual  estimate  is 

2  1  T_-l  ,,  c\ 

m  »  -  c  R  c  .  (3.6) 

o  n-p 

The  square  root  m  of  the  factor  is  also  called  the  standard  error  of 
weight  one. 
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It  is  clear  from  the  derivation  of  Eq.  (3.5)  that  the  formula  for 
the  variances  of  t  contains  first  and  second  order  derivatives  of  the 
model  function  F  in  spite  of  the  fact  that  the  formula  is  only  a  first 
order  estimate  of  the  variances.  (The  dependence  on  the  second  order 
derivatives  is  shown  explicitly  in  the  Appendix.)  As  noted  in  Reference  7, 
the  reason  for  the  presence  of  second  order  derivatives  in  Eq.  (3.5)  is 
the  appearance  of  first  order  derivatives  in  the  normal  Eq.  (3.2)  which 
are  perturbed  and  expanded  when  Eq.  (3.5)  is  derived.  Authors  who  prefer 
Gauss-Newton  algorithms  for  the  numerical  solution  of  the  normal  equations 
tend  to  overlook  this  fact  and  present  variance  estimate  formulas  without 
second  order  derivatives  of  the  model  function  f1»2»5,8,9.  Such  formulas 
are  less  than  first  order  accurate  and  should  not  be  used  without  an 
estimate  of  the  effect  of  the  neglected  terms.  One  can  easily  construct 
examples  where  the  second  order  derivative  terms  contribute  signifi¬ 
cantly  to  V  either  increasing  or  decreasing  the  estimated  variances. 


4 .  PARTITI0NABIL1TY 

If  the  data  volume  is  large,  then  the  numerical  solution  of  the 
normal  Eqs.  (2.11)  can  be  a  formidable  task.  In  a  typical  model  fitting 
problem  the  dimension  n  of  the  observations  is  the  order  1000  or  larger, 
and  one  has  to  manipulate  matrices  of  the  order  nxn  in  the  Newton-Raphson 
iteration  equations.  Fortunately,  many  least  squares  problems  have  such 
a  structure  that  the  large  systems  of  equations  can  be  partitioned  into 
a  set  of  smaller  systems,  whereby  the  manipulation  of  the  large  matrices 
can  be  avoided. 

We  illustrate  this  partitionability  with  a  curve  fitting  problem 
in  the  y,z-plane.  Let  the  curve  be  defined  by  the  implicit  equation 


f(x;t)  =  f (y,z;t)  -  0  , 


(4.1) 


where  x  is  the  coordinate  vector  in  the  y,z-plane  and  t  is  a  parameter 
vector.  Let  the  observations  X^,  (i  =  l,2,...,s)  be  the  coordinates  of 
s  points  in  the  y,z-plane,  i.e.. 


X 


i 


i  =  1,2, 


(4.2) 


and  let  the  accuracies  of  the  observed  points  be  characterized  by  corres¬ 
ponding  estimated  variance-covariance  matrices 


R 


i 


Viyz  \ 
Vizz  / 


1,2, ...,s 


(4.3) 
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The  correspondences  between  this  problem  and  the  general  model  fitting 
problem  are  as  follows 


(A. 4) 


(4.5) 


(4.6) 


Using  these  correspondences,  the  curve  or  model  fitting  problem  (2.7) 
can  be  defined  in  terms  of  the  subsets  X^  of  X,  the  submatrices  of  R, 
and  the  components  f  of  F  as  follows: 


s 

W  =  E  cTr”  c.  =  min.  | 

i-1  1  I 

[  (4.7) 

f(Xi+ci,t)  =  0,  i=l,2,...,s  .  J 

The  normal  equations  of  the  optimization  problem  (4.7)  are 

c^  -  R^*f^  (X^+c^,t)*k^  =  0,  i=l,2,...,s  (4.8a) 

s 

E  ki*ft(Xi+ci,t)  "  0  (4,8b) 

i=l 

f(X1+c1,t)  -  0,  i-1, 2 . s  (4.8c) 

where  (i-1, 2 . s)  are  the  correlates  of  the  problem. 
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A  comparison  of  Eqs.  (4.8)  with  the  general  normal  Eqs.  (2.11) 
shows  that  in  the  curve  fitting  problem  the  large  equation  system  is 
partitioned  into  a  set  of  smaller  systems,  so  that  the  largest  dimension 
of  matrices  to  be  manipulated  is  the  maximum  of  2x2  an£  pxp.  Particularly, 
Eq.  (4.8a)  is  a  set  of  s  systems  of  two  equations,  each  system  depending 
only  on  two  residuals,  whereas,  Eq.  (2.11a)  is  one  syJjbem  of  2s  equations 
depending  on  2s  residuals.  Likewise,  Eq.  (4.8c)  are  s* scalar  equations, 
each  depending  on  two  distinct  residuals,  whereas  the  corresponding 
Eq.  (2.11c)  is  a  system  of  s  coupled  equations  for  all  2s  residuals. 
Obviously,  the  numerical  treatment  of  Eqs.  (4.8)  is  much  simpler  than 
that  of  Eqs.  (2.11). 

A  basic  property  of  the  sample  problem  (4.7)  is  that  the  r  con¬ 
straints  are  scalar  equations,  (hence  r^  ■  1  and  r  »  Er^  -  s)  each 
depending  on  a  distinct  subset  of  the  observation  vector  X,  and  that 
the  s  subsets  X^  are  not  correlated.  We  call  such  a  problem  a  standard 
least  squares  problem  because  of  its  common  occurance  and  simplicity. 
Standard  least  squares  problems  are  easier  to  solve  numerically  than 
general  problems,  because  the  maximum  dimensions  of  matrices  in  the  normal 
equations  are  independent  of  the  total  number  of  observations.  A  prob¬ 
lem  with  the  latter  property  we  call  totally  partltionable.  Hence,  a 
standard  least  squares  problem  is  totally  partltionable.  If  the  data 
are  not  correlated,  then  any  fitting  of  a  hyper surface  to  points  in  a 
space  of  observables  is  a  totally  partltionable  problem.  Such  a  fitting 
in  a,  say,  m-dimensional  space  is  also  a  standard  problem  if  the  dimension 
of  the  hypersurface  is  m-1. 

Next,  we  derive  conditions  for  partitionability  of  the  normal 
equations  by  comparing  the  structures  of  Eqs.  (2.11)  and  (4.8).  First, 
we  notice  that  in  order  to  be  ablj  to  partition  Eq.  (2.11c)  at  all,  the 
model  function  F  must  be  transformable  into  such  a  form  that  subsets  of 
components  of  F  depend  on  distinc:  subsets  of  the  observations  X.  This 
property  can  be  conveniently  expressed  by  the  requirement  that  the 

stretched  block  diagonal  structure,  i.e.. 


(4.9) 


problem  (4.7),  F  has  this  property,  whereby 
the  two- component  vectors  3f/3X^.  In  more 
general  situations  the  submatrices  lave  dimensions  r^xn^  and  the  r^  and 
n^  can  be  different  for  different  indexes  i.  Because  3F/3X  is  a  rxn 
matrix,  then  obviously  Ir^  *  r  and  I'n^  «  n. 
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Jacobian  matrix  3F/3X  has  a 


In  the  sample  curve  fitting 
the  submatrices  3Fi/3Xi  are 


The  stretched  block  diagonal  structure  (4.9)  of  3F/3X  suffices  to 
partition  Eq.  (2.11c).  In  order  to  partition  Eq.  (2.11a)  too,  one  needs 
an  additional  condition  on  the  variance-covariance  matrix  R.  If  R  Is 
diagonal  and  (4.9)  holds,  then  Eq.  (2.11a)  Is  partltlonable.  However, 
for  partltionabillty  It  Is  already  sufficient  If  R  has  a  block  diagonal 
structure 


R 


(4.10) 


where  the  dimensions  n^  of  the  submatrices  R^  match  the  dimensions  nj 
of  the  submatrices  SF^/SX^.  Both  of  these  conditions  together  are  suf¬ 
ficient  to  partition  the  problem  into  s  parts.  Thus,  if  R  and  3F/3X 
have  the  indicated  structures,  Eq.  (2.11a)  has  the  form 


-  0  ,  (4.11) 


where  the  c^  are  distinct  subsets  of  c  with  the  dimensions  n^,  and  the 
k^  are  correlate  vectors  with  the  dimensions  r^. 

In  summary,  sufficient  for  the  partltionabillty  of  a  least  squares 
model  fitting  problem  is  that  the  following  two  conditions  hold: 

a.  R  has  a  block  diagonal  structure  (4.10),  and 

b.  3F/3X  has  a  matching  stretched  block  diagonal  structure  (4.9). 

In  data  reduction  problems  one  has  no  control  over  the  structure  of 
R,  except  during  the  planning  stage  of  an  experiment,.  Once  the  measure¬ 
ments  are  made,  R  is  part  of  the  given  data  basis.  However,  in  many 
practical  problems  R  is  diagonal  or  nearly  diagonal  with  few  non-zero 
off-diagonal  elements.  In  these  cases,  the  partltionabillty  of  the  prob¬ 
lem  depends  on  the  formulation  of  the  model  equations  F  ■  0.  By  a  proper 
manipulation  of  the  constraints  one  can  often  partition  a  problem  that 
was  not  partltlonable  in  the  original  formulation.  We  shall  give  an 
example  of  such  a  manipulation  in  Section  6. 
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The  algorithmic  advantages  of  partitioning  cannot  be  overemphasized. 
In  fact,  partitionability  rather  than  the  special  form  of  the  object 
function  W  is  the  practically  important  difference  between  a  least  squares 
problem  and  a  general  optimization  problem.  Most  published  algorithms 
for  the  solution  of  least  squares  problems  are  restricted  to  partitionable 
cases. 


5.  MANIPULATION  OF  PARAMETERS 

This  section  gives  an  overview  of  parameter  manipulations  that  can 
be  used  to  achieve  partitionability  of  least  squares  model  fitting  prob¬ 
lems.  First,  we  express  the  constraint  Eqs.  (2.3)  in  the  more  convenient 
form 


F  (c,t)  *  0 


(5.1) 


by  including  the  observed  X  into  the  definition  of  the  model  function  ¥. 
Like  Eq.  (2.3),  Eq.  (5.1)  is  a  set  of  r  scalar  equations.  In  general, 
each  of  the  r  equations  depends  on  different  subsets  of  the  components 
of  c  and  t.  A  subset  te  of  the  parameter  vector  t  we  define  as  the  vec¬ 
tor  of  essential  parameters  if  all  components  of  te  appear  in  all  r 
Eqs.  (5.1).  All  other  parameters  we  call  non-essential.  A  constraint 
formulation  that  contains  only  essential  parameters  we  call  a  minimal 
formulation. 

Minimal  formulations  of  constraints  are  important  in  the  context  of 
partitionability.  To  illustrate  this,  we  notice  that  the  numbers  and 
types  of  parameters  are  not  intrinsic  properties  of  a  model  fitting 
problem.  Parameters  can  be  eliminated  and  added,  within  limits,  without 
changing  the  solution  of  the  problem.  However,  from  an  algorithmic 
viewpoint  it  is  not  advisable  to  eliminate  essential  parameters,  i.e., 
to  reduce  the  problem  formulation  below  a  minimal  formulation. 

We  illustrate  this  remark  by  considering  a  standard  problem  with  a 
diagonal  variance-covariance  matrix  R  and  only  essential  parameters. 

Let  the  constraints  (5.1)  of  the  problem  be,  componentwise. 


fi  “  °»  i“l,2,...,r  .  (5.2) 

In  order  to  eliminate  the  parameters  we  may  use  the  last  p  Eqs.  (5.2) 
and  express  t  in  terms  of  the  residual  subset  ce  ■  (cr_p+i,  ...»  cr) . 
Substituting  this  expression  into  the  first  r-p  equations,  one  obtains 
a  system  of  constraints,  equivalent  to  the  original  system,  but  without 
parameters,  namely. 
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$(c)  -  0 


(5.3) 


with  che  components 


^i(ci,ce)  “  0.  i“l» • • • *r-p  ,  (5.4) 

Now,  Che  arguments  of  the  components  l!±  of  ^  are  not  distinct  subsets 
of  c  and,  therefore,  the  Jacobian  matrix  3^/ 3c  has  not  the  necessary 
stretched  block  diagonal  form.  The  problem  is  not  partitionable  in  this 
formulation. 

While  a  problem  formulation  with  fewer  constraints  than  in  a  mini¬ 
mal  formulation  certainly  is  not  optimal  for  a  numerical  treatment,  one 
may  find,  in  some  cases,  that  larger  than  minimal  formulations  are  more 
practical.  One  obtains  such  formulations  by  introducing  new  parameters 
into  the  problem.  The  total  number  of  parameters  that  can  be  added  to 
a  least  squares  problem  is,  however,  limited  by  the  inequalities  (2.2). 

Let  us  assume  that  for  a  given  problem  the  inequalities  are  satisfied, 
and  let  us  introduce  £  new  parameters  into  the  problem.  The  corresponding 
p  new  equations  which  define  the  parameters  are  added  to  the  set  of  con¬ 
straint  equations.  Therefore,  the  inequality  (2.2)  for  the  new  problem 
formulation  is 


0<  p+f  <  r  <n  .  (5.5) 

Hence,  the  number  p  of  new  parameters  that  can  be  introduced  into  a 
problem  is  limited  by  the  condition 

P  <_  n  -  r  .  (5.6) 


A  "natural,"  application  oriented,  formulation  of  a  model  fitting 
problem  is  not  necessarily  the  most  advantageous  one  for  numerical  treat¬ 
ment,  and  may  even  be  sub-minimal,  as  in  the  following  example.  Suppose 
that  some  effect  y  of  an  explosion  has  been  observed  at  different  sta¬ 
tions  as  a  function  of  time  t.  Then  the  data  basis  consists  of  an  obser¬ 
vation  T0  of  the  time  tQ  of  the  explosion  and  of  a  series  of  pairs  (T,Y), 
providing  the  observed  effects  Y  at  times  T.  Let  the  theoretical  model 
equation  of  the  observed  effect  at  station  j  be 


y  *  f j (t  -  tQ;  a,  8,  y) 


(5.7) 
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where  a,  B,  and  y  are  essential  model  parameters.  Then  the  corresponding 
constraint  equations  are 


Y1  +  cYi 


-  * 


'Ti 


-  (T  + 

o 


CTo); 


e,  y)=0,  (i  -  1,  2, 


r) 


(5.8) 


The  formulation  (5.8)  is  minimal,  because  all  parameters  are  essen¬ 
tial.  However,  the  problem  is  not  partitionable  even  when  all  observa¬ 
tions  are  uncorrelated,  because  all  constraint  equations  contain  the 

same  observation  T  . 

o 

A  partitioning  of  this  problem  can  be  easily  achieved  by  int  oducing 
the  starting  time  of  the  explosion  as  a  fourth  parameter  6.  The  cor¬ 
respondingly  modified  set  of  constraint  Eqs.  (5.8)  is 


+  cyi  -  f(T±  +  cTi  -  6;  o,  6,  y)  -  0  ,  (i  -  1, 


T 

o 


+  c 


To 


6  =  0. 


Now  we  have  r  +  1  constraints,  four  parameters  and  (m  +  1)  •  r  +  1 
observations,  where  m  =  dimY.  The  inequalities  (5.5)  are  in  this  case 


0  <  4  <  r  +  1  <  (m  +  1)  *r  +  1 


(5.10) 


indicating,  that  one  needs  at  least  four  observation  sets  (T,Y)  to  have 
a  regular  adjustment  problem. 

The  problem  formulation  (5.9)  is  not  minimal,  because  the  param¬ 
eters  a,  B,  and  y  are  not  essential.  However,  because  they  appear  in 
all  but  one  of  the  constraint  equations,  their  elimination  would  not 
simplify  the  problem.  If  the  data  are  not  correlated,  then  in  the  form¬ 
ulation  (5.9)  the  problem  is  totally  partitionable.  It  is,  in  fact,  a 
standard  problem,  if  the  observed  effects  Y^  are  scalar,  i.e.,  if  m  =  1. 

The  goal  of  manipulation  of  the  model  equations  is  to  obtain  an 
equation  system  with  a  stretched  block  diagonal  Jacobian  matrix.  If  the 
model  equations  are  linear,  then  this  can  be  achieved  by  algebraic  man¬ 
ipulations.  For  problems  with  nonlinear  implicit  model  equations,  the 
probably  most  effective  approach  is  through  parameter  manipulation,  such 
as  shown  in  the  previous  example.  A  numerical  example  of  another  prob¬ 
lem  will  be  given  in  the  next  section. 
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6.  EXAMPLE 


We  present  as  a  numerical  example  a  least  squares  model  fitting 
problem  arising  from  the  adjustment  of  a  planimetric  net.  The  measure¬ 
ments  in  such  a  net  have  to  satisfy  net  closure  conditions,  i.e.,  con¬ 
straints  without  any  parameters.  The  constraints  form  a  set  of  simul¬ 
taneous  nonlinear  equations  involving  all  corrections,  and  net  adjustment 
problems  are  not  partitionable  in  this  formulation.  However,  by  intro¬ 
ducing  station  coordinates  as  parameters  net  adjustment  problems  always 
can  be  partitioned. 

A  simple  specific  example  is  the  planimetric  traverse  shown  in 
Figure  1.  Let  the  observations  be  the  distances  r^  between  the  stations 
and  the  corresponding  azimuths  $£.  The  constraints  for  the  closed 
polygon  of  Figure  1  are  obtained  from  the  model  equations  (closure  con¬ 
ditions) 


5 

L  risin*i  °  0 

i=i 

and 

5 

y  r^cosi^  ■  0 
i=l 


(6.1) 


by  substituting  in  them  the  corrected  r^  +  cri  and  <f>i  +  c<j,i  for  the 
observed  r^  and  ,  respectively.  Hence,  the  problem  has  two  nonlinear 
scalar  model  equations  (r-2) ,  ten  observations  (n=10) ,  and  no  param¬ 
eters  (p*0) .  For  simplicity,  we  assume  that  the  observations  are  not 
correlated,  i.e.,  that  the  estimated  variance-covariance  matrix  R  is 
diagonal.  Nevertheless,  the  adjustment  problem  is  not  partitionable 
in  this  formulation. 

Next,  we  introduce,  as  parameters,  the  coordinates  of  the  stations 
1,  2,  3,  and  4  relative  to  the  reference  station  A.  With  these  eight 
parameters,  the  two  original  model  Eqs.  (6.1)  can  be  expressed  equiva¬ 
lently  by  the  following  five  sets  of  two  equations  each: 
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s±n<#>1  -  xA 

cos^  -  y1/ 

+  r2  sin<j>2  - 
+  r2  cos4>2  - 

+  r^  sin<(>2  - 
+  cos4>2  - 

+  r4  sin<|>4  - 
+  r4  cos<f>4  - 

+  r5  sin 4»5  \ 
+  r,.  cosij),.  / 


> 


|  (6.2) 


The  corresponding  constraint  equations  are  obtained  by  substituting  in 
Eq.  (6.2)  the  corrected  distances  and  angles  for  the  observed  ones.  Now, 
we  have  s»5,  r=*10,  n=10,  and  p*8.  According  to  Eq.  (5.2)  no  additional 
parameters  can  be  introduced.  In  this  formulation,  the  Jacobian  matrix 
of  the  model  function  has  the  block  diagonal  form 


9(r5,*5) 


with  2x2  matrices  in  the  diagonal.  Hence,  the  problem  partitions 
into  five  subsets. 
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We  note  in  passing  that  in  this  example  the  reformulation  of  tue 
model  equations  has  only  slightly  reduced  the  size  of  the  problem.  In 
the  original  formulation,  we  would  have  to  deal  with  10  x  10  matrices  in 
the  normal  equations,  corresponding  to  the  ten  observations.  The  par¬ 
titioning  has  reduced  this  part  of  the  formulation  to  five  2x2  matrices, 
which  can  be  handled  much  easier.  However,  we  have  also  introduced 
eight  parameters  and  corresponding  8x8  matrices  in  the  normal  equations. 
Hence,  the  reduction  of  matrix  sizes  is  only  from  10  x  10  to  8  x  8.  If 
numerical  efficiency  were  important  in  this  example,  then  one  would  in¬ 
troduce  fewer  than  eight  parameters,  e.g.,  only  the  coordinates  of 
Stations  1  and  3.  In  that  formulation,  the  largest  matrl..  to  be  handled 
would  be  only  4x4. 

Numerical  values  of  the  observations  are  given  in  Table  1,  together 
with  the  results  of  the  adjustments  which  were  calculated  with  the  com¬ 
puter  program  COLSMU  of  Reference  10.  The  parameter  values,  i.e.,  the 
station  coordinates  are  listed  in  Table  2,  and  Table  3  provides  the  cor¬ 
relation  coefficients  between  the  station  coordinates.  The  correlation 
coefficient  matrix  C  is  defined  in  terms  of  the  variance-covariance 
matrix  V  by 

C  =  D-1/2  Vfc  D-1/2  (6.4) 

where 

D  -  diag  Vt  .  (6.5) 


TABLE 

1 .  OBSERVATIONS 

AND  ADJUSTMENTS 

Nr 

r  (km) 

e 

r 

c 

r 

r+c 

r 

<K°) 

c4> 

*+c<(, 

1 

10.5 

0.47 

0.356 

10.856 

77.0 

1.0 

0.33 

76.67 

2 

2.9 

0.25 

-0.148 

2.752 

202.0 

1.0 

-0.01 

201.99 

3 

4.6 

0.32 

-0.107 

4.493 

273.0 

1.0 

0.17 

273.17 

4 

7.0 

0.40 

-0.199 

6.801 

151.0 

1.0 

-0.24 

150.76 

5 

10.1 

0.46 

0.045 

10.145 

304.0 

1.0 

0.42 

304.42 
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TABLE  2.  ADJUSTED  COORDINATES  OF  STATIONS  WITH  ESTIMATED  STANDARD  ERRORS 


Nr 

x  (km) 

e  (km) 

X 

y  (km) 

ey(km) 

1 

10.563 

0.324 

2.503 

0.183 

2 

9.533 

0.306 

-0.049 

0.228 

3 

5.047 

0.286 

0.199 

0.237 

4 

8.369 

0.310 

-5.735 

0.242 

Weighted  sum  of  correction  squares  W  =  1.665  635  46 

Standard  error  with  weight  one  m =  0.912  589 

The  factor  m  is  not  included  in  the  standard  errors 
of  station  coordinates. 


TABLE  3.  CORRELATION  MATRIX  OF  ADJUSTED  STATION  COORDINATES 


X1 

yl 

x2 

y2 

x4 

y3 

X4 

y4 

X1 

1.0000 

0.2080 

0.9572 

-0.1477 

0.5516 

-0. 1303 

0.4494 

-0.1801 

yl 

0.2080 

1.0000 

0.1332 

0.5161 

-0.0594 

0.4872 

0.0394 

0.1454 

X2 

0.9572 

0.1332 

1.0000 

0.0053 

0.5358 

0.0141 

0.4626 

-0.1375 

y2 

-0.1477 

0.5161 

0.0053 

1.0000 

-0.2509 

0.9417 

-0.0407 

0.3170 

X3 

0.5516 

-0.0594 

0.5358 

-0.2509 

1.0000 

-0.3059 

0.8018 

-0.3450 

y3 

-0.1303 

0.4872 

0.0141 

0.9417 

-0.3059 

1.0000 

-0.0771 

0.3470 

XA 

0.4494 

0.0394 

0.4626 

-0.0407 

0.8018 

-0.0771 

1.000 

-0.6092 

4  -0.1801  0.1454  -0.1375  0.3170  -0.3450 


0.3470  -0.6092 


1.0000 


1 


Both  matrices,  C  and  Vt,  were  calculated  by  the  cited  utility  rou¬ 
tine  which  uses  Eq.  (3.5)  for  the  calculation  of  Vt. 

The  adjusted  positions  of  the  stations  are  shown  in  Figure  1 
together  with  corresponding  two  standard  error  ellipses  indicating  the 
accuracies  of  the  positions. 

Estimates  of  variances  and  covariances  of  the  adjusted  survey  sta¬ 
tions  are  important  when  the  stations  are  used  as  bases  for  other  mea¬ 
surements.  As  an  example,  let  us  assume  that  Stations  2  and  3  are  used 
to  determine  the  position  of  a  target  by  azimuth  measurements.  Let 
\\>2  and  i>2  be  the  azimuths  observed  from  Stations  2  and  3,  respectively. 
Then  the  target  is  given  by 


sin\|i2 

Xt  "  X2  sin  (^  -  \p2) 
cose 

yt  "  y2  sin  (\jJ3  -  ip2) 


Estimated  variances  and  covariances  of  the  target  coordinates  can 
be  computed  from  the  estimated  accuracies  of  the  azimuth  observations 
and  the  variances  and  covariances  of  the  bases'  coordinates  by  applying 
the  linearized  law  of  variance  propagation  to  Eq.  (6.6).  The  results 
are  shown  in  Table  4  and  Figure  1.  As  expected,  one  obtains  different 
estimates  of  the  target  accuracies,  depending  whether  the  correlations 
between  stations  are  taken  into  account  or  not. 


(x3  -  x2)  cosi^3  +  (y3  -  y2)  sini^ 


i(6.6) 


(x3  -  x2)  cos^3  +  (y3  -  Y2)  sin^3 


TABLE  4.  TARGET  POSITION 


OBSERVED  AZIMUTHS  OF  TARGET 
i>2  -  149.0  +  0.3 
if>3  -  123.0  +  0.3 

COMPUTED  COORDINATES  OF  TARGET 

a.  All  Covariances  Considered 

xt  -  12.159  +  0.437 
y  -  -4.419  +  0.337 

Correlation  coefficient  c  =  -0.667256 

xy 

b.  Covariances  Between  Stations  Neglected 

xt  -  12.159  +  0.609 
y  -  -4.419  +  0.555 

Correlation  coefficient  c  =  -0.906523 

xy 

c.  All  Covariances  Neglected 

xt  =  12.159  +  0.629 
y  =  -4.419  +  0.616 

Correlation  coefficient  c_^  »  -0.900271 


2 


Figure  1.  Plan ~1  metric  Traverse 


V  -  Reference  station 
■  -  original  survey  station 
4  -  adjusted  survey  station 

+  -  target,  determined  by  azimuth  measurements  from  Stations  2  and  3 


The  difference  between  Stations  A  and  A'  is  the  closure  error  of  the 
traverse.  A  and  A'  coincide  after  adjustment.  Accuracies  of  the  adjusted 
stations  and  of  the  target  are  indicated  by  two  standard  error  ellipses. 
The  larger  error  ellipse  around  the  target  is  obtained  if  correlations 
between  station  coordinates  are  neglected.  (Case  (c)  in  Table  4.)  Data 
are  given  in  Tables  1  through  4, 
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APPENDIX  A 

ITERATION  FORMULAS 


I 

1 


We  provide  a  set  of  iteration  formulas  that  are  derived  from  the 
Newton  Eqs.  (3.1)  by  algebraic  manipulations.  First,  we  define  the 
following  matrices: 


G  = 

(A.  1) 

A  * 

efxTgfx  -  1 

(A.  2) 

r  = 

[I  +  ar(ktf)^]-1 

(A.  3) 

E 

o 

=  r  •  [AC  -  rfxtgfx1 

(A. 4) 

V 

=  r  •  [RFxTGFt  +  AR(KTF)xtl 

(A. 5) 

D 

o 

-  <*>«  -  FtTGFxR<KTr)xx 

(A.  6) 

V 

■  '<KTp)tt  -  ^V^xt 

(A.  7) 

N  = 

F^GF.  -  D.  +  DE1 
t  t  1  0  1 

(A. 8) 

iteration  equations  are 

Nt 

■  FtTG(Fxc  -  F)  +  “oE0 

(A.  9) 

K  + 

K  =  G(FXC-F)  +  G[Ft  +  FxR(KTF)Xt]  x 

-  GF^O^Fj^e 

(A. 10) 

e  = 

Eo  ‘  EiT  • 

(A. 11) 
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Numerical  experiments  have  shown  that  the  convergence  of  the  iteration 
is  enhanced  if  the  equations  are  used  in  a  subiteration  mode  by  iterating 
alternatively  on  the  parameters  and  residuals,  respectively.  For  param¬ 
eter  subiteration  only  Eqs.  (A. 9)  and  (A. 10)  are  used,  assuming  e  =  0. 

For  residual  subiteration  one  sets  t^O  and  uses  Eqs.  (A. 10)  and  (A. 11). 

In  the  variance  formula  (3.5)  one  uses  N,  defined  by  Eq.  (A. 8)  and 


S  =  FtTGFX  +  VA  *  (A.  12) 

Another  equivalent  set  of  Newton-Raphson  iteration  equations  is 
given  in  Reference  7.  None  of  the  sets  is  numerically  superior  to  the 
other,  and  both  require  subiterations  of  parameters  and  residuals  for 
efficiency. 

Gauss-Newton  iteration  equations  can  be  obtained  from  Newton-Raphson 
iteration  equations  by  setting  all  second  order  derivatives  equal  to 
zero.  The  convergence  of  Gauss-Newton  algorithms  is  inferior,  but  in 
some  applications  they  have  a  larger  domain  of  convergence. 
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LIST  OF  SYMBOLS 


c,  C  -  residuals 

f,  F  -  constraint  (model)  functions 

k,  K  -  Correlates  (Lagrange  multipliers) 

M  -  norm  matrix 

n  -  dim  X  (total  number  of  observations) 

N  -  coefficient  matrix  for  variance  estimation  of  t 

p  -  dim  t  (number  of  parameters) 

r  -  dim  F  (number  of  scalar  constraint  equations) 

R  -  estimated  variance-covariance  matrix  of  the  observations  X 
s  -  number  of  subsets  in  a  partitionable  problem 

S  -  influence  matrix  for  variance  estimation  of  t 

t,  T  -  parameters 

x,  X  -  observables 

Vt  -  estimated  variance-covariance  matrix  of  t 

T  -1 

W  -  object  function  =  c  R  c 
fa  -  modified  object  function 

e  -  correction  of  C 

k  -  correction  of  K 

t  -  correction  of  T 
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